Coordinated Science Laboratory at the University of Illinois at Urbana-Champaign
Abstract:We present FalconWing -- an open-source, ultra-lightweight (150 g) fixed-wing platform for autonomy research. The hardware platform integrates a small camera, a standard airframe, offboard computation, and radio communication for manual overrides. We demonstrate FalconWing's capabilities by developing and deploying a purely vision-based control policy for autonomous landing (without IMU or motion capture) using a novel real-to-sim-to-real learning approach. Our learning approach: (1) constructs a photorealistic simulation environment via 3D Gaussian splatting trained on real-world images; (2) identifies nonlinear dynamics from vision-estimated real-flight data; and (3) trains a multi-modal Vision Transformer (ViT) policy through simulation-only imitation learning. The ViT architecture fuses single RGB image with the history of control actions via self-attention, preserving temporal context while maintaining real-time 20 Hz inference. When deployed zero-shot on the hardware platform, this policy achieves an 80% success rate in vision-based autonomous landings. Together with the hardware specifications, we also open-source the system dynamics, the software for photorealistic simulator and the learning approach.
Abstract:We present the first framework demonstrating zero-shot sim-to-real transfer of visual control policies learned in a Neural Radiance Field (NeRF) environment for quadrotors to fly through racing gates. Robust transfer from simulation to real flight poses a major challenge, as standard simulators often lack sufficient visual fidelity. To address this, we construct a photorealistic simulation environment of quadrotor racing tracks, called FalconGym, which provides effectively unlimited synthetic images for training. Within FalconGym, we develop a pipelined approach for crossing gates that combines (i) a Neural Pose Estimator (NPE) coupled with a Kalman filter to reliably infer quadrotor poses from single-frame RGB images and IMU data, and (ii) a self-attention-based multi-modal controller that adaptively integrates visual features and pose estimation. This multi-modal design compensates for perception noise and intermittent gate visibility. We train this controller purely in FalconGym with imitation learning and deploy the resulting policy to real hardware with no additional fine-tuning. Simulation experiments on three distinct tracks (circle, U-turn and figure-8) demonstrate that our controller outperforms a vision-only state-of-the-art baseline in both success rate and gate-crossing accuracy. In 30 live hardware flights spanning three tracks and 120 gates, our controller achieves a 95.8% success rate and an average error of just 10 cm when flying through 38 cm-radius gates.
Abstract:Autonomous air taxis are poised to revolutionize urban mass transportation, however, ensuring their safety and reliability remains an open challenge. Validating autonomy solutions on air taxis in the real world presents complexities, risks, and costs that further convolute this challenge. Verification and Validation (V&V) frameworks play a crucial role in the design and development of highly reliable systems by formally verifying safety properties and validating algorithm behavior across diverse operational scenarios. Advancements in high-fidelity simulators have significantly enhanced their capability to emulate real-world conditions, encouraging their use for validating autonomous air taxi solutions, especially during early development stages. This evolution underscores the growing importance of simulation environments, not only as complementary tools to real-world testing but as essential platforms for evaluating algorithms in a controlled, reproducible, and scalable manner. This work presents a V&V framework for a vision-based landing system for air taxis with vertical take-off and landing (VTOL) capabilities. Specifically, we use Verse, a tool for formal verification, to model and verify the safety of the system by obtaining and analyzing the reachable sets. To conduct this analysis, we utilize a photorealistic simulation environment. The simulation environment, built on Unreal Engine, provides realistic terrain, weather, and sensor characteristics to emulate real-world conditions with high fidelity. To validate the safety analysis results, we conduct extensive scenario-based testing to assess the reachability set and robustness of the landing algorithm in various conditions. This approach showcases the representativeness of high-fidelity simulators, offering an effective means to analyze and refine algorithms before real-world deployment.
Abstract:Testing Automated Driving Systems (ADS) in simulation with realistic driving scenarios is important for verifying their performance. However, converting real-world driving videos into simulation scenarios is a significant challenge due to the complexity of interpreting high-dimensional video data and the time-consuming nature of precise manual scenario reconstruction. In this work, we propose a novel framework that automates the conversion of real-world car crash videos into detailed simulation scenarios for ADS testing. Our approach leverages prompt-engineered Video Language Models(VLM) to transform dashcam footage into SCENIC scripts, which define the environment and driving behaviors in the CARLA simulator, enabling the generation of realistic simulation scenarios. Importantly, rather than solely aiming for one-to-one scenario reconstruction, our framework focuses on capturing the essential driving behaviors from the original video while offering flexibility in parameters such as weather or road conditions to facilitate search-based testing. Additionally, we introduce a similarity metric that helps iteratively refine the generated scenario through feedback by comparing key features of driving behaviors between the real and simulated videos. Our preliminary results demonstrate substantial time efficiency, finishing the real-to-sim conversion in minutes with full automation and no human intervention, while maintaining high fidelity to the original driving events.
Abstract:This paper addresses the problem of visual target tracking in scenarios where a pursuer may experience intermittent loss of visibility of the target. The design of a Switched Visual Tracker (SVT) is presented which aims to meet the competing requirements of maintaining both proximity and visibility. SVT alternates between a visual tracking mode for following the target, and a recovery mode for regaining visual contact when the target falls out of sight. We establish the stability of SVT by extending the average dwell time theorem from switched systems theory, which may be of independent interest. Our implementation of SVT on an Agilicious drone [1] illustrates its effectiveness on tracking various target trajectories: it reduces the average tracking error by up to 45% and significantly improves visibility duration compared to a baseline algorithm. The results show that our approach effectively handles intermittent vision loss, offering enhanced robustness and adaptability for real-world autonomous missions. Additionally, we demonstrate how the stability analysis provides valuable guidance for selecting parameters, such as tracking speed and recovery distance, to optimize the SVT's performance.
Abstract:Large Language Models (LLMs) have shown remarkable capabilities in natural language processing, mathematical problem solving, and tasks related to program synthesis. However, their effectiveness in long-term planning and higher-order reasoning has been noted to be limited and fragile. This paper explores an approach for enhancing LLM performance in solving a classical robotic planning task by integrating solver-generated feedback. We explore four different strategies for providing feedback, including visual feedback, we utilize fine-tuning, and we evaluate the performance of three different LLMs across a 10 standard and 100 more randomly generated planning problems. Our results suggest that the solver-generated feedback improves the LLM's ability to solve the moderately difficult problems, but the harder problems still remain out of reach. The study provides detailed analysis of the effects of the different hinting strategies and the different planning tendencies of the evaluated LLMs.
Abstract:Perception contracts provide a method for evaluating safety of control systems that use machine learning for perception. A perception contract is a specification for testing the ML components, and it gives a method for proving end-to-end system-level safety requirements. The feasibility of contract-based testing and assurance was established earlier in the context of straight lane keeping: a 3-dimensional system with relatively simple dynamics. This paper presents the analysis of two 6 and 12-dimensional flight control systems that use multi-stage, heterogeneous, ML-enabled perception. The paper advances methodology by introducing an algorithm for constructing data and requirement guided refinement of perception contracts (DaRePC). The resulting analysis provides testable contracts which establish the state and environment conditions under which an aircraft can safety touchdown on the runway and a drone can safely pass through a sequence of gates. It can also discover conditions (e.g., low-horizon sun) that can possibly violate the safety of the vision-based control system.
Abstract:A runtime assurance system (RTA) for a given plant enables the exercise of an untrusted or experimental controller while assuring safety with a backup (or safety) controller. The relevant computational design problem is to create a logic that assures safety by switching to the safety controller as needed, while maximizing some performance criteria, such as the utilization of the untrusted controller. Existing RTA design strategies are well-known to be overly conservative and, in principle, can lead to safety violations. In this paper, we formulate the optimal RTA design problem and present a new approach for solving it. Our approach relies on reward shaping and reinforcement learning. It can guarantee safety and leverage machine learning technologies for scalability. We have implemented this algorithm and present experimental results comparing our approach with state-of-the-art reachability and simulation-based RTA approaches in a number of scenarios using aircraft models in 3D space with complex safety requirements. Our approach can guarantee safety while increasing utilization of the experimental controller over existing approaches.
Abstract:Perception modules are integral in many modern autonomous systems, but their accuracy can be subject to the vagaries of the environment. In this paper, we propose a learning-based approach that can automatically characterize the error of a perception module from data and use this for safe control. The proposed approach constructs a {\em perception contract (PC)\/} which generates a set that contains the ground-truth value that is being estimated by the perception module, with high probability. We apply the proposed approach to study a vision pipeline deployed on a quadcopter. With the proposed approach, we successfully constructed a PC for the vision pipeline. We then designed a control algorithm that utilizes the learned PC, with the goal of landing the quadcopter safely on a landing pad. Experiments show that with the learned PC, the control algorithm safely landed the quadcopter despite the error from the perception module, while the baseline algorithm without using the learned PC failed to do so.
Abstract:ChatGPT is an AI language model developed by OpenAI that can understand and generate human-like text. It can be used for a variety of use cases such as language generation, question answering, text summarization, chatbot development, language translation, sentiment analysis, content creation, personalization, text completion, and storytelling. While ChatGPT has garnered significant positive attention, it has also generated a sense of apprehension and uncertainty in academic circles. There is concern that students may leverage ChatGPT to complete take-home assignments and exams and obtain favorable grades without genuinely acquiring knowledge. This paper adopts a quantitative approach to demonstrate ChatGPT's high degree of unreliability in answering a diverse range of questions pertaining to topics in undergraduate computer science. Our analysis shows that students may risk self-sabotage by blindly depending on ChatGPT to complete assignments and exams. We build upon this analysis to provide constructive recommendations to both students and instructors.